Real-time multilingual HMM training robust to channel variations

Authors

  • Ea-Ee Jan
  • Jaime Botella Ordinas
  • George Saon
  • Salim Roukos
Abstract

This paper describes our efforts towards a real-time multilingual telephony Large Vocabulary Continuous Speech Recognition (LVCSR) system. The trilingual (English, French, and Spanish) landline/cellular hybrid system is compared to each of our best monolingual systems. The results are comparable, with a degradation of less than approximately 10%. An HMM state quality measurement technique is explored to improve the performance of multilingual acoustic models. A pilot experiment on an English/Spanish bilingual system demonstrates very good results: we achieved improvements of between 5% and 20% under different test conditions. To extend the system to speakerphone applications, we employed different front-end processing techniques, mainly CDCN prior to HDA and MLLT, reducing the error rate on the trilingual system by as much as 30%. These results suggest that trilingual acoustic models can be used for real telephony applications.


Similar resources

Very Low Resource Radio Browsing for Agile Developmental and Humanitarian Monitoring

We present a radio browsing system developed on a very small corpus of annotated speech by using semi-supervised training of multilingual DNN/HMM acoustic models. This system is intended to support relief and developmental programmes by the United Nations (UN) in parts of Africa where the spoken languages are extremely under-resourced. We assume the availability of 12 minutes of annotated speec...


Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...


A robust environment-effects suppression training algorithm for adverse Mandarin speech recognition

In this paper, a new robust training algorithm is proposed for generating a set of bias-removed, noise-suppressed reference speech HMM models directly from a training database collected in an adverse environment suffering from both convolutional channel bias and additive noise. Its main idea is to incorporate a signal bias-compensation operation and a PMC noise-compensation operation into it...


Multilingual speech recognition: A posterior based approach

Modern automatic speech recognition (ASR) systems are based on parametric statistical models such as hidden Markov models (HMMs), exploiting 1) acoustic-phonetic models, which need to be trained on a large amount of acoustic data, 2) a language model, which needs to be trained on a large amount of text data and, finally, 3) a lexicon with phonetic transcriptions, which requires linguistic expertise. ...


Robust HMM training for unified Dutch and German speech recognition

This paper describes our recent work in developing a unified Dutch and German speech recognition system in the SpeechDat domain. The acoustic component of the multilingual system is built by sharing common phonemes without preserving any information about the languages. We propose a more robust MCE-based training algorithm, where only the language-dependent phoneme models are allowe...




Publication date: 2000